Feat: Implementing Streaming Generation for simply #7

Open
Rohit-Andhavarapu wants to merge 3 commits into google-deepmind:main from Rohit-Andhavarapu:feat/streaming-gen

@Rohit-Andhavarapu

This PR adds streaming generation to LMInterface, so researchers can consume tokens as they are generated instead of waiting for the full sequence.

Summary:

In simply/model_lib.py:

StreamingToken

  • New dataclass for yielded tokens (lines ~3238-3264)

decode_one_step()

  • New function for single-step decoding without while_loop (lines ~3937-4039)

LMInterface.generate_stream()

  • New generator method that yields tokens as they're generated (lines ~3688-3913)
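
For reviewers, here is a minimal sketch of how the three pieces above could fit together. It is not the code in this PR. Only token_text and is_final are taken from the usage example; the token_id field, the step_fn and id_to_text callables, and the eos_id and max_steps parameters are assumptions made for illustration. The point it shows is that a generator has to return control to Python after every step, which a loop fused into a single while_loop cannot do.

```python
import dataclasses
from typing import Callable, Iterator, List, Sequence


@dataclasses.dataclass(frozen=True)
class StreamingToken:
    """Illustrative stand-in, not the dataclass added by the PR."""
    token_id: int    # assumed field
    token_text: str  # appears in the PR usage example
    is_final: bool   # appears in the PR usage example


def generate_stream(
    prompt_ids: Sequence[int],
    step_fn: Callable[[Sequence[int]], int],  # hypothetical single-step decoder
    id_to_text: Callable[[int], str],         # hypothetical detokenizer
    eos_id: int,
    max_steps: int,
) -> Iterator[StreamingToken]:
    """Yields one token per step_fn call from a plain Python loop."""
    ids: List[int] = list(prompt_ids)
    for step in range(max_steps):
        next_id = step_fn(ids)
        ids.append(next_id)
        # The stream ends on EOS or when the step budget is exhausted.
        is_final = next_id == eos_id or step == max_steps - 1
        yield StreamingToken(next_id, id_to_text(next_id), is_final)
        if is_final:
            return


def toy_step(ids: Sequence[int]) -> int:
    """Toy 'model' that emits the previous id plus one."""
    return ids[-1] + 1


tokens = list(
    generate_stream([1], toy_step, lambda i: f"<{i}>", eos_id=4, max_steps=10)
)
```

With the toy step function the stream stops as soon as the assumed EOS id 4 is produced, and only that last token has is_final set.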

In simply/model_lib_test.py:

  • Added test_lm_interface_generate_stream() test case
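
As a rough illustration of the kind of invariants a streaming test might check, the sketch below validates a list of tokens without any model. It is not the body of test_lm_interface_generate_stream(). FakeToken and check_stream are hypothetical names, and the rule that exactly the last token is final is an assumption drawn from the usage example.

```python
import dataclasses
from typing import Iterable


@dataclasses.dataclass(frozen=True)
class FakeToken:
    """Hypothetical stand-in with the two fields used in the usage example."""
    token_text: str
    is_final: bool


def check_stream(tokens: Iterable[FakeToken]) -> str:
    """Asserts basic stream invariants and returns the concatenated text."""
    tokens = list(tokens)
    assert tokens, "stream should yield at least one token"
    finals = [t.is_final for t in tokens]
    assert finals[-1], "last token should be marked final"
    assert not any(finals[:-1]), "only the last token should be final"
    return "".join(t.token_text for t in tokens)


text = check_stream([FakeToken("Hello", False), FakeToken(" world", True)])
```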

Usage:

from simply import model_lib

lm = model_lib.LMInterface(model, params, vocab=vocab)

for token in lm.generate_stream("Once upon a time"):
    print(token.token_text, end="", flush=True)
    if token.is_final:
        print()  # newline at end
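
Because the stream is an ordinary Python iterator, a caller can also stop consuming it early. The sketch below collects streamed text and cuts it off at a stop string. collect_text and fake_stream are hypothetical helpers written for this example; they rely only on the token_text and is_final attributes shown above and do not call into simply.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def collect_text(stream: Iterable, stop_text: Optional[str] = None) -> str:
    """Joins token_text values, stopping early if stop_text shows up."""
    pieces = []
    for token in stream:
        pieces.append(token.token_text)
        text = "".join(pieces)
        if stop_text is not None and stop_text in text:
            # Truncate at the stop string and stop pulling from the stream.
            return text[: text.index(stop_text)]
        if token.is_final:
            break
    return "".join(pieces)


def fake_stream():
    """Stands in for lm.generate_stream(...) so the example runs without a model."""
    words = ["Once", " upon", " a", " time", "."]
    for i, word in enumerate(words):
        yield SimpleNamespace(token_text=word, is_final=(i == len(words) - 1))


full = collect_text(fake_stream())
short = collect_text(fake_stream(), stop_text=" a")
```

Returning from the loop early simply leaves the generator unfinished, so no further decoding steps are requested from it.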

@google-cla

google-cla bot commented Jan 30, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
